Onodera, Naoyuki; Idomura, Yasuhiro; Ali, Y.*
no journal, ,
Real-time simulation of the environmental dynamics of radioactive substances is very important from the viewpoint of nuclear security. Because tall buildings and complex structures make the air flow turbulent in urban areas, large-scale CFD simulations are needed. To this end, a CFD code based on the Lattice Boltzmann Method (LBM) with a block-based Adaptive Mesh Refinement (AMR) method is developed. As the conventional LBM based on a single-relaxation-time collision operator often becomes numerically unstable at high Reynolds numbers, we apply a state-of-the-art cumulant collision operator. The code is developed on a GPU cluster at JAEA. By using new functions in CUDA 8.0, the GPU kernel functions are tuned to achieve high performance on the latest Pascal GPU architecture. By introducing a temporal blocking technique, we achieve a high performance of 488 MLUPS per GPU, and the number of MPI communications is significantly reduced.
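For reference, MLUPS (million lattice updates per second) is the usual throughput metric for LBM codes: the number of lattice-site updates divided by the wall-clock time, in units of 10^6. A minimal Python sketch of that arithmetic follows. The function name `mlups` and the grid size, step count, and timing in the usage line are illustrative assumptions, not values from the abstract, and the formula assumes a uniform grid (with AMR, the number of lattice sites would be summed over all blocks).

```python
def mlups(nx, ny, nz, steps, seconds):
    """Million lattice updates per second for a uniform nx*ny*nz grid.

    Hypothetical helper for illustration; not part of the code described above.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    site_updates = nx * ny * nz * steps  # total lattice-site updates performed
    return site_updates / seconds / 1.0e6


# Illustrative numbers only: a 256^3 grid advanced 100 steps in 4 seconds.
throughput = mlups(256, 256, 256, 100, 4.0)
```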
Shimokawabe, Takashi*; Onodera, Naoyuki
no journal, ,
Grid-based physical simulations on multiple GPUs increasingly require effective methods to adapt the grid resolution to certain sensitive regions of the simulation. In GPU computation, Adaptive Mesh Refinement (AMR) is one of the effective methods for computing local regions that demand higher accuracy at higher resolution. We are developing a block-based AMR framework for stencil applications written in C++ and CUDA. Programmers only write the stencil functions that update a grid point on a Cartesian grid. The framework executes these functions efficiently over a tree-based AMR data structure. The framework supports multiple GPUs and provides C++ classes to exchange halo regions and migrate data between GPUs. In this paper, we describe the programming model and implementation of the AMR framework for multiple GPUs, and show the computation results of a compressible fluid calculation based on the proposed AMR framework.
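The programming model described above, in which the user writes only a per-point stencil function and the framework applies it over the AMR blocks, can be sketched in a few lines. This is a CPU-only Python illustration under stated assumptions, not the framework's C++/CUDA API: the names `diffusion_point` and `apply_stencil` are hypothetical, the tree is reduced to a flat list of leaf blocks, and halo exchange between blocks and GPUs is omitted.

```python
def diffusion_point(u, i, j, c=0.1):
    """User-written stencil: new value of one grid point (5-point diffusion)."""
    neighbours = u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1]
    return u[i][j] + c * (neighbours - 4.0 * u[i][j])


def apply_stencil(leaf_blocks, point_func):
    """Hypothetical framework loop: update every interior point of every leaf block.

    Each block is a 2D list with a one-cell halo; halo cells are copied unchanged.
    """
    updated = []
    for u in leaf_blocks:
        n_rows, n_cols = len(u), len(u[0])
        new = [row[:] for row in u]
        for i in range(1, n_rows - 1):
            for j in range(1, n_cols - 1):
                new[i][j] = point_func(u, i, j)
        updated.append(new)
    return updated


# Two leaf blocks of different size stand in for coarse and fine AMR levels.
coarse = [[0.0] * 4 for _ in range(4)]
fine = [[0.0] * 6 for _ in range(6)]
fine[2][2] = 1.0  # a single disturbance in the fine block
coarse_new, fine_new = apply_stencil([coarse, fine], diffusion_point)
```

The stencil function never refers to the block layout, so the same function can be applied to blocks at any refinement level; that separation is the point of the sketch.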
Ohashi, Kunihide*; Onodera, Naoyuki
no journal, ,
The applicability of shared-memory computing on the Xeon Phi Knights Landing processor is examined with a flow solver dedicated to computing the flow around a ship hull. The test cases are computations at high Reynolds numbers with one- or two-equation turbulence models and a free-surface model, and additionally with the overset grid method. The Xeon Broadwell and Skylake processors show higher processing speeds than the Xeon Phi processor. On the Xeon Phi processor, the speed-up ratio increases smoothly up to the maximum number of cores.
Tanaka, Masaaki
no journal, ,
The activities of the subcommittee of the Atomic Energy Society of Japan (AESJ) for the publication of the "Guideline for Credibility Assessment of Nuclear Simulations 2015" are introduced, and a case study applying VVUQ (verification and validation plus uncertainty quantification) to the numerical simulation of a T-junction piping system is briefly presented.